39 research outputs found

    Interpolation and Extrapolation of Toeplitz Matrices via Optimal Mass Transport

    In this work, we propose a novel method for quantifying distances between Toeplitz structured covariance matrices. By exploiting the spectral representation of Toeplitz matrices, the proposed distance measure is defined based on an optimal mass transport problem in the spectral domain. This may then be interpreted in the covariance domain, suggesting a natural way of interpolating and extrapolating Toeplitz matrices, such that the positive semi-definiteness and the Toeplitz structure of these matrices are preserved. The proposed distance measure is also shown to be contractive with respect to both additive and multiplicative noise, and thereby allows for a quantification of the decreased distance between signals when these are corrupted by noise. Finally, we illustrate how this approach can be used for several applications in signal processing. In particular, we consider interpolation and extrapolation of Toeplitz matrices, as well as clustering problems and tracking of slowly varying stochastic processes.
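The idea of interpolating in the spectral domain and reading the result back in the covariance domain can be illustrated with a small sketch. This is not the authors' algorithm. It assumes line spectra with equal weights, a convex transport cost on the real line (wrap-around on the circle is ignored), and hypothetical helper names `toeplitz_from_lines` and `interpolate_lines`. Under those assumptions the optimal transport plan pairs the sorted frequencies, so interpolation simply moves each spectral line.

```python
import numpy as np

def toeplitz_from_lines(freqs, weights, n):
    """Build T[a, b] = r[a - b], where r[k] = sum_j w_j * exp(1j * k * theta_j)."""
    freqs = np.asarray(freqs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    idx = np.arange(n)
    lags = idx[:, None] - idx[None, :]  # (n, n) matrix of lags a - b
    # Summing over spectral lines with nonnegative weights gives a
    # Hermitian, positive semi-definite Toeplitz matrix.
    return np.sum(weights * np.exp(1j * lags[..., None] * freqs), axis=-1)

def interpolate_lines(freqs0, freqs1, tau):
    """Move each line along the monotone (sorted) pairing, for 0 <= tau <= 1."""
    f0 = np.sort(np.asarray(freqs0, dtype=float))
    f1 = np.sort(np.asarray(freqs1, dtype=float))
    return (1.0 - tau) * f0 + tau * f1

n = 6
w = np.array([0.5, 0.5])           # equal weights (assumption)
f_start = np.array([0.4, 1.5])     # line frequencies in radians (illustrative)
f_end = np.array([0.9, 2.1])

f_mid = interpolate_lines(f_start, f_end, 0.5)
T_mid = toeplitz_from_lines(f_mid, w, n)
```

Because the interpolated spectrum is still a nonnegative measure, `T_mid` remains Toeplitz and positive semi-definite by construction, which is the structural property the abstract emphasizes.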

    A Parametric Method for Multi-Pitch Estimation

    This thesis proposes a novel method for multi-pitch estimation. The method operates by posing pitch estimation as a sparse recovery problem, which is solved using convex optimization techniques. In that respect, it is an extension of an earlier estimation method based on the group-LASSO. However, by introducing an adaptive total variation penalty, the proposed method requires fewer user-supplied parameters, thereby simplifying the estimation procedure. The method is shown to have comparable or superior performance in low-noise environments when compared to three standard multi-pitch estimation methods as well as the predecessor method. Also presented is a scheme for automatic selection of the regularization parameters, thereby making the method more user-friendly. Used together with this scheme, the proposed method is shown to yield accurate, although not statistically efficient, pitch estimates when evaluated on synthetic speech data.

    Defining Fundamental Frequency for Almost Harmonic Signals

    In this work, we consider the modeling of signals that are almost, but not quite, harmonic, i.e., composed of sinusoids whose frequencies are close to being integer multiples of a common frequency. Typically, in applications, such signals are treated as perfectly harmonic, allowing for the estimation of their fundamental frequency, despite the signals not actually being periodic. Herein, we provide three different definitions of a concept of fundamental frequency for such inharmonic signals and study the implications of the different choices for modeling and estimation. We show that one of the definitions corresponds to a misspecified modeling scenario, and provides a theoretical benchmark for analyzing the behavior of estimators derived under a perfectly harmonic assumption. The second definition stems from optimal mass transport theory and yields a robust and easily interpretable concept of fundamental frequency based on the signals' spectral properties. The third definition interprets the inharmonic signal as an observation of a randomly perturbed harmonic signal. This allows for computing a hybrid information theoretical bound on estimation performance, as well as for finding an estimator attaining the bound. The theoretical findings are illustrated using numerical examples. Comment: Accepted for publication in IEEE Transactions on Signal Processing

    Zero-Shot Blind Audio Bandwidth Extension

    Audio bandwidth extension involves the realistic reconstruction of high-frequency spectra from bandlimited observations. In cases where the lowpass degradation is unknown, such as in restoring historical audio recordings, this becomes a blind problem. This paper introduces a novel method called BABE (Blind Audio Bandwidth Extension) that addresses the blind problem in a zero-shot setting, leveraging the generative priors of a pre-trained unconditional diffusion model. During the inference process, BABE utilizes a generalized version of diffusion posterior sampling, where the degradation operator is unknown but parametrized and inferred iteratively. The performance of the proposed method is evaluated using objective and subjective metrics, and the results show that BABE surpasses state-of-the-art blind bandwidth extension baselines and achieves competitive performance compared to non-blind filter-informed methods when tested with synthetic data. Moreover, BABE exhibits robust generalization capabilities when enhancing real historical recordings, effectively reconstructing the missing high-frequency content while maintaining coherence with the original recording. Subjective preference tests confirm that BABE significantly improves the audio quality of historical music recordings. Examples of historical recordings restored with the proposed method are available on the companion webpage: http://research.spa.aalto.fi/publications/papers/ieee-taslp-babe/. Comment: Submitted to IEEE/ACM Transactions on Audio, Speech and Language Processing

    An Adaptive Penalty Approach to Multi-Pitch Estimation

    This work treats multi-pitch estimation, and in particular the common misclassification issue wherein the pitch at half of the true fundamental frequency, here referred to as a sub-octave, is chosen instead of the true pitch. Extending current methods, which use an extension of the group LASSO for pitch estimation, this work introduces an adaptive total variation penalty that both enforces group and block sparsity and deals with errors due to sub-octaves. The method is shown to outperform current state-of-the-art sparse methods when the model orders are unknown, while also requiring fewer tuning parameters than these. The method is also shown to outperform several conventional pitch estimation methods, even when these are given oracle model orders.
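The sub-octave ambiguity described above follows from simple arithmetic: every harmonic of a pitch f0 is also an even harmonic of f0/2. The sketch below only illustrates that ambiguity and is not the penalty proposed in the work. The helper names and the 220 Hz example are assumptions chosen for illustration.

```python
import numpy as np

def harmonic_freqs(f0, n_harmonics):
    """Frequencies of the first n_harmonics harmonics of f0."""
    return f0 * np.arange(1, n_harmonics + 1)

def explained_fraction(observed, f0, n_harmonics, tol=1e-6):
    """Fraction of observed spectral lines matched by some harmonic of f0."""
    cand = harmonic_freqs(f0, n_harmonics)
    hits = [np.any(np.abs(cand - f) < tol) for f in observed]
    return float(np.mean(hits))

def unused_harmonics(observed, f0, n_harmonics, tol=1e-6):
    """Number of candidate harmonics with no matching observed line."""
    cand = harmonic_freqs(f0, n_harmonics)
    return int(sum(not np.any(np.abs(observed - c) < tol) for c in cand))

true_f0 = 220.0                          # illustrative pitch in Hz
observed = harmonic_freqs(true_f0, 5)    # 220, 440, 660, 880, 1100

# Both the true pitch and its sub-octave explain every observed line ...
frac_true = explained_fraction(observed, true_f0, 5)
frac_sub = explained_fraction(observed, true_f0 / 2, 10)

# ... but the sub-octave leaves its odd harmonics without any support.
unused_true = unused_harmonics(observed, true_f0, 5)
unused_sub = unused_harmonics(observed, true_f0 / 2, 10)
```

A pure goodness-of-fit criterion cannot separate the two candidates, since both explain all observed lines. Some additional structure on the harmonic amplitudes, such as the penalty discussed in the abstract, is needed to disfavor the candidate whose harmonics are alternately empty.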

    Modeling and Sampling of Spectrally Structured Signals

    This thesis consists of five papers concerned with the modeling of stochastic signals, as well as deterministic signals in stochastic noise, exhibiting different kinds of structure. This structure is manifested as the existence of finite-dimensional parameterizations, and/or in the geometry of the signals' spectral representations. The first two papers of the thesis, Papers A and B, consider the modeling of differences, or distances, between stochastic processes based on their second-order statistics, i.e., covariances. By relating the covariance structure of a stochastic process to spectral representations, it is proposed to quantify the dissimilarity between two processes in terms of the cost associated with morphing one spectral representation to the other, with the cost of morphing being defined in terms of the solutions to optimal mass transport problems. The proposed framework allows for modeling smooth changes in the frequency characteristics of stochastic processes, which is shown to yield interpretable and physically sensible predictions when used in applications of temporal and spatial spectral estimation. Also presented are efficient computational tools, allowing for the framework to be used in high-dimensional problems. Paper C considers the modeling of so-called inharmonic signals, i.e., signals that are almost, but not quite, harmonic. Such signals appear in many fields of signal processing, not least in audio. Inharmonicity may be interpreted as a deviation from a parametric structure, as well as from a particular spectral structure. Based on these views, as well as on a third, stochastic interpretation, Paper C proposes three different definitions of the concept of fundamental frequency for inharmonic signals, and studies the estimation theoretical implications of utilizing either of these definitions. Paper D then considers deliberate deviations from a parametric signal structure arising in spectroscopy applications. With the motivation of decreasing the computational complexity of parameter estimation, the paper studies the implications of estimating the parameters of the signal in a sequential fashion, starting out with a simplified model that is then refined step by step. Lastly, Paper E studies how parametric descriptions of signals can be leveraged to design sampling or measurement schemes that are optimal in an estimation theoretical sense. By means of a convex program, samples are selected so as to minimize bounds on estimator variance, allowing for efficiently measuring parametric signals, even when the parametrization is only partially known.

    Sparse Modeling of Harmonic Signals

    This thesis considers sparse modeling and estimation of multi-pitch signals, i.e., signals whose frequency content can be described by superpositions of harmonic, or close-to-harmonic, structures, characterized by a set of fundamental frequencies. As the number of fundamental frequencies in a given signal is in general unknown, this thesis casts the estimation as a sparse reconstruction problem, i.e., estimates of the fundamental frequencies are produced by finding a sparse representation of the signal in a dictionary containing an over-complete set of pitch atoms. This sparse representation is found by using convex modeling techniques, leading to highly tractable convex optimization problems from whose solutions the estimates of the fundamental frequencies can be deduced. In the first paper of this thesis, a method for multi-pitch estimation for stationary signal frames is proposed. Building on the heuristic of spectrally smooth pitches, the proposed method produces estimates of the fundamental frequencies by minimizing a sequence of penalized least squares criteria, where the penalties adapt to the signal at hand. An efficient algorithm building on the alternating direction method of multipliers is proposed for solving these least squares problems. The second paper considers a time-recursive formulation of the multi-pitch estimation problem, allowing for the exploitation of longer-term correlations of the signal, as well as fundamental frequency estimates with a sample-level time resolution. Also presented is a signal-adaptive dictionary learning scheme, allowing for smooth tracking of frequency modulated signals. In the third paper of this thesis, robustness to deviations from the harmonic model in the form of inharmonicity is considered. The paper proposes a method for estimating the fundamental frequencies by, in the frequency domain, mapping each found spectral line to a set of candidate fundamental frequencies. The optimal mapping is found as the solution to a minimal transport problem, wherein mappings leading to sparse pitch representations are promoted. The presented formulation is shown to yield robustness to varying degrees of inharmonicity without requiring explicit knowledge of the structure or scope of the inharmonicity. In all three papers, the performance of the proposed methods is evaluated using simulated signals as well as real audio.

    Online Estimation of Multiple Harmonic Signals

    In this paper, we propose a time-recursive multipitch estimation algorithm using a sparse reconstruction framework, assuming that only a few pitches from a large set of candidates are active at each time instant. The proposed algorithm does not require any training data, and instead utilizes a sparse recursive least-squares formulation augmented by an adaptive penalty term specifically designed to enforce a pitch structure on the solution. The amplitudes of the active pitches are also recursively updated, allowing for a smooth and more accurate representation. When evaluated on a set of ten music pieces, the proposed method is shown to outperform other general-purpose multipitch estimators in either accuracy or computational speed, although it does not match the performance of state-of-the-art methods that are optimally tuned and specifically trained on the present instruments. However, the method is able to outperform such a technique when the latter is used without optimal tuning, or when it is applied to instruments not included in the training data.